Optional Stopping
Abstract
Introduction
Suppose you are determined to “prove” that green apples cause cancer. An Optional Stopping strategy (OS) is one where you keep sampling experimental data until the observed correlation between eating green apples and cancer is significantly different from 0 (where “significantly” means that the null hypothesis is rejected by standard statistical tests). That is, you follow a rule that says “Don’t stop until you reject the null hypothesis”. This is also the best strategy for confirming the existence of UFOs or establishing the phenomenon of extrasensory perception (ESP) (see Feller, 1940, for the confutation of this tongue-in-cheek assertion). If the data are ‘noisy’ (and whose data are not?), then this will probably always work in principle, though not always in practice, because you won’t live long enough to collect enough data.
There are two schools of thought about optional stopping examples of the kind that I consider (see Robbins, 1952, for a more general discussion). The classical hypothesis testers say that it is a bad strategy if the probability of falsely rejecting the null hypothesis when it is true is 1. Some Bayesians, and likelihood theorists, say that all that matters is how well the hypotheses fit the data, and that it makes no difference whether you collect n data by an OS strategy or collect n data with the prior intention of stopping at a sample size of n (strategy FS). About the only thing that has never been said about OS is that it is better than FS (with the same n).
What follows is a series of computer simulations comparing an OS strategy with an FS strategy. The first simulation assumes that the null hypothesis is true, so that rejection of the null hypothesis is always a mistake. When I initially ran the simulation, I found that I had to wait too long for my computer to finish its experiments (despite its running at 450 MHz). So I set an upper limit of 2000 data points. For me, reaching 2000 data points without stopping counts as “no outcome”. As you will see from the results below, the experiment had “no outcome” 40% of the time. The next two simulations show how things change when the null hypothesis is false. The results were initially surprising to me. It turns out that OS can be a more reliable method than FS most of the time. In the final section, I explain this odd result in terms of an easy-to-understand analogy.
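The kind of simulation described above can be sketched roughly as follows. This is only an illustration, not the author's actual code: the abstract does not specify the test or the data model, so the sketch assumes standard normal data with known unit variance and a two-sided z-test of "mean = 0" at the 5% level. The function names, the `min_n` parameter, and the choice of running FS at the cap `max_n` are all assumptions made for the example.

```python
import math
import random

Z_CRIT = 1.96  # assumed: two-sided 5% critical value for a z-test


def optional_stopping(rng, true_mean=0.0, max_n=2000, min_n=2):
    """OS: add one data point at a time and stop at the first rejection.

    Returns the sample size at which the null (mean = 0) was rejected,
    or None ("no outcome") if max_n points were reached without rejection.
    """
    total = 0.0
    for n in range(1, max_n + 1):
        total += rng.gauss(true_mean, 1.0)
        # z statistic for the sample mean with known unit variance
        if n >= min_n and abs(total) / math.sqrt(n) > Z_CRIT:
            return n
    return None


def fixed_sample(rng, n, true_mean=0.0):
    """FS: collect exactly n points, then test once. True means rejection."""
    total = sum(rng.gauss(true_mean, 1.0) for _ in range(n))
    return abs(total) / math.sqrt(n) > Z_CRIT


def compare(trials=500, max_n=2000, true_mean=0.0, seed=0):
    """Return (OS rejection rate, FS rejection rate) over repeated runs."""
    rng = random.Random(seed)
    stops = [optional_stopping(rng, true_mean, max_n) for _ in range(trials)]
    os_rate = sum(s is not None for s in stops) / trials
    fs_hits = sum(fixed_sample(rng, max_n, true_mean) for _ in range(trials))
    return os_rate, fs_hits / trials


if __name__ == "__main__":
    os_rate, fs_rate = compare()
    print(f"null true: OS rejects {os_rate:.0%}, FS rejects {fs_rate:.0%}")
```

With `true_mean=0.0` every rejection is a false positive, so the gap between the two rates shows how much repeated testing inflates the error rate; setting `true_mean` to a nonzero value gives a rough analogue of the later simulations in which the null hypothesis is false.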
Similar resources
Optional stopping: no problem for Bayesians.
Optional stopping refers to the practice of peeking at data and then, based on the results, deciding whether or not to continue an experiment. In the context of ordinary significance-testing analysis, optional stopping is discouraged, because it necessarily leads to increased type I error rates over nominal values. This article addresses whether optional stopping is problematic for Bayesian inf...
Optional Polya Tree and Bayesian Inference
We introduce an extension of the Pólya tree approach for constructing distributions on the space of probability measures. By using optional stopping and optional choice of splitting variables, the construction gives rise to random measures that are absolutely continuous with piecewise smooth densities on partitions that can adapt to fit the data. The resulting “optional Pólya tree” distribution...
Optional Processes with Non-exploding Realized Power Variation along Stopping times Are Làglàd
Abstract We prove that an optional process of non-exploding realized power variation along stopping times possesses almost surely làglàd paths. This result is useful for the analysis of some imperfect market models in mathematical finance. In the finance applications variation naturally appears along stopping times and not pathwise. On the other hand, if the power variation were only taken alon...
Doob, Ignatov and Optional Skipping
A general set of distribution-free conditions is described under which an i.i.d. sequence of random variables is preserved under optional skipping. 1. Introduction and motivation. This paper discusses a general set of conditions under which an i.i.d. sequence of random variables ξ1, ξ2, ..., taking values in a measurable space (X, B), with common distribution F, is preserved under "optio...
Stopping rules matter to Bayesians too
This paper considers a key point of contention between classical and Bayesian statistics—the issue of stopping rules, or more generally, outcome spaces, and their influence on statistical analysis. Firstly, a working definition of classical and Bayesian statistical tests is given, which makes clear that i) once a test has been conducted and an outcome recorded, only the classical approach to in...
Almost the Best of Three Worlds: Risk, Consistency and Optional Stopping for the Switch Criterion in Nested Model Selection
We study the switch distribution, introduced by van Erven, Grünwald and De Rooij (2012), applied to model selection and subsequent estimation. While switching was known to be strongly consistent, here we show that it achieves minimax optimal parametric risk rates up to a log log n factor when comparing two nested exponential families, partially confirming a conjecture by Lauritzen (2012) and Cav...